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CHAPTER  I 


INTRODUCTION 


The  significant  improvement  in  Navy  parameter  identifica- 
tion capability  has  been  demonstrated  in  recent  aircraft  flight 
test  programs.  The  utilization  of  this  capability  has  brought 
the  parameter  identification  algorithms  to  a point  where  restric- 
tions to  the  applications  of  this  technology  occur  because  of  im- 
precisely defined  a priori  model  forms  and  poorly  excited  modes 
[1-3]. 


The  problem  of  ill-defined  model  structures  occurs  naturally 
as  the  application  of  the  system  identification  technology  is 
extended  to  more  complex  Navy  systems  C©*g*»  surface  and  subsur- 
face marine  vehicles,  aircraft,  inertial  navigation  components). 
Additional  complexity  arises  in  those  operation  regimes  which  are 
characterized  by  significant  variation  in  governing  physical 
phenomenon  Ce*g-.  separation  of  flow  at  high  angle-of-attack) . 

This  problem  occurs  because  of  two  main  reasons:  (1)  these  systems 


often  cannot  be  well  analyzed  by  theoretical  methods  and  the  assump- 
tions of  the  analyses  are  questionable  in  extreme  regimes,  and  (2) 
even  if  a reasonable  theoretical  investigation  could  be  carried  out, 
the  resulting  model  may  be  too  complex  to  be  useful  for  engineer- 
ing or  simulation  objectives.  To  complement  theoretical  models, 
it  is  therefore  desirable  to  have  techniques  for  estimating  the 
structure  of  linear  and  nonlinear  models,  of  varying  complexity, 
from  test  data.  Such  model  structure  determination  algorithms, 
when  applied  to  a set  of  data,  could  give  the  simplest  model  which 
substantiates  the  datav  'The  resulting  models  may  be  used,  for 


tt 


example,  in  control  system  design,  handling  qualities  evaluation, 
stability  analysis,  and  simulation  verification. 


Proper  excitation  of  significant  modes  is  necessary  to  ef- 
fectively estimate  both  model  structure  and  the  parameters  of  that 


s 


model  structure.  Specification  of  the  test  procedure  is,  thus, 
of  paramount  importance  in  high  order  systems  and  in  complex 
nonlinear  systems.  When  there  are  many  important  modes,  a conven- 
tional input  (e.g.,  step,  pulse)  may  not  excite  all  the  modes 
leading  to  identifiability  problems.  In  addition,  in  nonlinear 
systems,  it  may  not  be  feasible  to  "hold"  the  system  in  any 
particular  regime  for  a sufficient  time  to  acquire  enough  data, 
because  of  safety  and  stability  considerations.  Therefore,  inputs 
which  are  effective  in  giving  good  identifiability  of  parameters 
under  amplitude  or  time  constraints  are  important  for  practical 
identification  applications. 

This  report  describes  the  work  in  areas  of  model  structure 
determination  and  input  selection.  Special  emphasis  is  given  to 
their  application  to  nonlinear  systems. 

1.1  PRINCIPAL  DEVELOPMENTS 


Thw  work  conducted  under  the  present  effort  has  produced 
many  theoretical  and  algorithmic  developments . The  main  results 
are  summarized  here. 

The  following  theoretical  results  are  obtained: 

(1)  Optimal  Selection  of  Parameterization  in  Estimation 

Problems":  Methods,  based  upon  the  likelihood  function, 

have  been  developed  for  selection  of  a set  of  parameters 
which  best  describe  the  observed  data.  Extraneous  para- 
meters are  dropped. 

(2)  Model  Structure  Estimation  Using  Splines:  Use  of  poly- 

nomial  splines  in  representing  unknown  nonlinearity 
and  time  variation  is  investigated.  System  identifica- 
tion methods  based  upon  such  representation  are  pre- 
sented. 

(3)  Input  Design  for  Model  Discrimination:  Inputs,  which 

enhance  differentiability  among  many models  based  upon 
data,  are  studied  and  methods  to  select  such  inputs  are 
investigated. 
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In  addition  to  these  theoretical  results,  several  results 
are  obtained  which  should  improve  current  system  identification 
algorithms.  These  include: 

Cl)  Improved  Techniques  for  Isolation  of  Poorly  Identifi- 
able Parameters;  The  parameters .which  are  only  margin- 
ally identifiable,  can  be  discarded  with  improved  relia- 
bility with  the  development  of  better  parameterization 
techniques . 

C2)  Optimal  Subset  Rej^ression  on  Spline  Representation: 
Method  for  determining  a useful  spline  representation 
through  the  application  of  the  optimal  subset  regres- 
sion technique  have  been  developed. 

(3)  Subopt imal  Multistep  Input:  A practical  technique  for 

designing  suboptimal  multistep  inputs  for  linear  and 
nonlinear  systems  has  been  developed. 

Both  the  theoretical  and  the  algorithmic  developments  have 
advanced  state-of-the-art  of  system  identification,  in  particular, 
its  application  to  the  formulation  of  mathematical  models  for 
Navy  vehicles. 

1.2  SUMMARY 


This  report  is  organized  as  follows.  Chapter  II  discusses, 
in  brief,  the  previous  results  in  the  areas  of  model  structure 
determination  and  input  design.  New  model  structure  estimation 
methods  are  developed  in  Chapter  III,  followed  by  the  input  design 
techniques  in  Chapter  IV.  Chapter  V gives  the  conclusions. 
Applications  of  these  algorithms  to  testing  of  Navy  vehicles  and 
other  systems  is  described  in  other  reports  and  technical  papers. 


CHAPTER  II 


REVIEW  OF  PREVIOUS  WORK 


2.1  INTRODUCTION 

The  problem  of  model  structure  estimation  and  the  importance 
of  selecting  appropriate  test  signals  have  long  been  recognized. 
Only  preliminary  work,  however,  has  been  done  on  these  problems 
because  even  the  basic  techniques  for  parameter  estimation  were 
unavailable  until  recently.  Most  of  the  work  on  model  structure 
determination  and  input  designs  until  now  has  been  applicable 
only  to  linear  systems.  Previous  model  structure  determination 
and  input  design  methods  are  now  discussed. 

2.2  TECHNIQUES  FOR  MODEL  STRUCTURE  DETERMINATION 

Methods  for  isolating  a model  structure  from  the  response 
data  have  been  developed  mainly  for  linear  systems.  In  linear 
systems,  the  problem  of  model  structure  estimation  is  simpler 
because  the  structures  of  multiinput/multioutput  linear  systems 
is  completely  defined  by  a finite  set  of  real  integers  called 
the  canonical  indices  [4-6]. 

The  current  techniques  for  finding  the  canonical  indices 
from  measured  data  use  various  approaches.  Tse  and  Weinert  [7] 
use  the  fact  that  Hankel  matrices  of  orders  higher  than  a certain 
value  are  singular.  They  compute  the  highest  order  of  the  Hankel 
matrix  whose  determinant  is  nonzero.  This  gives  one  of  the 
canonical  indices.  This  process  is  repeated  for  all  canonical 
indices.  Such  a method  may  work  well  if  the  number  of  inputs  and 
outputs  is  small  and  the  number  of  data  points  is  large.  Often 


however,  it  is  unsatisfactory  because  statistical  properties  of 
the  determinant  of  the  Hankel  matrix  are  not  known. 

Akaike  [8]  has  also  developed  a method  for  determining  model 
structures  of  linear  systems.  The  method  selects  canonical 
indices  based  upon  the  minimization  of  the  prediction  error  of 
the  outputs  (a  criterion  called  the  final  prediction  error  cri- 
terion is  used).  It  is  shown,  that  in  the  stochastic  case,  the 
dimension  of  this  predictor  space  is  equivalent  to  the  system 
order.  This  method  has  been  applied  quite  extensively  with  some 
success  [9]  and  some  problems  [10].  Modifications  to  solve  some 
of  the  problems  with  the  Akaike  final  prediction  error  criterion 
have  been  suggested  [10] . Similar  criteria  have  been  suggested 
by  Parzen  [11]  and  others.  Chi-square  and  statistical  F-ratio 
tests  may  also  be  applied. 

Estimation  of  model  structures  in  nonlinear  systems  is  a 
relatively  newer  field  and  few  results  are  available.  The  pio- 
neering work  in  this  area  was  done  by  Hall  and  Gupta  [12].  They 
used  a general  polynomial  representation  for  the  unknown  nonlinear 
relationships  and  an  optimal  subset  regression  approach  to  esti- 
mate the  unknown  nonlinear  aerodynamic  effects  at  high  angle-of- 
attack.  The  results  were  applied  to  flight  data  from  an  F-4 
aircraft  with  good  success.  This  procedure,  however,  has  limita- 
tions. The  polynomials  may  be  inadequate  for  approximating  some 
nonlinearities  commonly  encountered  in  physical  systems.  In 
addition,  polynomial  models  may  be  computationally  marginally 
stable.  Also,  the  coefficients  of  the  polynomial  terms  may  have  an 
unacceptably  large  correlation.  It  is  apparent,  therefore,  that 
the  parameters  have  to  be  carefully  selected  in  nonlinear 

estimation  problems,  resulting  in  a need  to  use  a better  repre- 
sentation for  the  nonlinearities. 


2.3  TECHNIQUES  FOR  INPUT  DESIGN 


Techniques  for  input  design  have  been  developed  only  for  the 
linear  systems.  The  selection  of  test  signals  was  first  con- 
sidered by  Goodwin  [13],  Reid  [14-] , and  Mehra  [15].  Goodwin  and 
Reid  worked  with  the  parameter  covariance  matrix  and  tried  to 
optimize  it  directly.  The  optimization  problem  was  so  complex 
that  only  very  simple  problems  could  be  solved.  Mehra  [IS]  worked 
with  the  trace  of  the  information  matrix  which  gave  a simpler 
optimization  problem,  but  the  inputs  were  not  very  useful.  Later, 
Gupta  and  Hall  [16],  Mehra  [17]  and  Mehra  and  Gupta  [18]  were 
able  to  simplify  the  procedure  for  optimizing  various  functions 
of  the  covariance  matrix.  This  made  it  possible  to  solve  fairly 
complex  problems,  but  the  computation  time  was  still  quite  high. 
Recently,  G.  Reid  [19]  has  shown  how  sums  of  Walsh  functions  may 
be  used  to  further  reduce  the  computation  time. 

2 . 4 SUMMARY 


Past  work  on  input  design  and  model  structure  estimation 
has  been,  in  a great  part,  limited  to  linear  systems  and  simple 
criteria.  There  is  a need  to  extend  these  methods  to  produce 
new  algorithms  for  nonlinear  systems.  In  addition,  the  techniques 
for  linear  systems  need-  to  be  futher  developed  so  that  they  are 
computationally  feasible  for  modestly  complex  systems. 


CHAPTER  III 


MODEL  STRUCTURE  DETERMINATION  TECHNIQUES 
3.1  INTRODUCTION 

The  development  of  a useful  mathematical  model  from  a given 
response  data  involves  two  major  steps:  Cl)  model  structure 

estimation,  and  (2)  parameter  identification.  The  parameter 
identification  problem  has  been  a subject  of  research  of  many 
authors  leading  to  several  techniques  which  can  be  applied  to 
linear,  as  well  as  nonlinear,  systems  [20-22]. 

The  model  structure  estimation  problem  has  to  be  treated 
separately  for  linear  and  nonlinear  systems.  In  linear  systems, 
this  problem  simplifies  considerably  because  the  basic  structure 
of  linear  models  is  completely  specified  by  a finite  set  of  real 
integers  called  canonical  indices.  Therefore,  evaluation  of  this 
small  set  of  real  integers  completely  determines  the  structure 
of  linear  models.  Techniques  for  estimating  canonical  indices 
from  response  data  have  been  developed  by  Akaike  [8] , Tse  and 
Weinert  [7]  and  Vanden  Boom,  et  al.  [23],.  as  discussed  in  the 
previous  chapter. 

Absence  of  such  convenient  indices  in  nonlinear  systems 
makes  the  problem  of  determining  model  structure  particularly 
difficult.  It  is  to  be  mentioned  that  the  estimation  of  a com- 
pletely unknown  nonlinear  function  from  data  contaminated  with 
noise  is  not  a well-posed  problem  [24].  Therefore,  the  best  that 
can  be  hoped  is  to  get  a good  approximation  for  the  nonlineari- 
ties . 

During  the  course  of  the  present  effort,  two  methods  have 
been  developed.  The  first  method  starts  off  with  the  most  complex 
model  of  the  system  which  may  be  expected  from  physical  or  other 


problems  during  estimation.  Therefore,  useful  parameterization 
must  be  selected  such  that  the  primary  parameters  are  estimated 
with  maximijm  accuracy  and  the  model  is  manageable.  It  should  be 
noted  that  dropping  some  of  the  secondary  parameters  will  give 
biased  estimates  of  the  primary  parameters.  These  biased  esti- 
mates, however,  have  a lower  mean- square  estimation  error  if  the 
parameterization  is  selected  carefully. 

We  begin  with  two  examples  in  Section  3.2.1.  General  nonlin- 
ear systems  are  considered  in  Section  3.2.2.  Section  3.2.3  gives 
the  summary  and  the  application  of  the  method  to  the  general  model 
structure  determination  problems. 


To  motivate  the  results  in  the  next  sections,  two  examples 
are  now  given. 


Example  3.1 

Consider  a nonlinear  regression  example  in  which  the  input  u 
and  the  output  y are  related  by  the  equation 


V is  random  noise  with  unit  variance.  This  is  the  true  model. 
Suppose,  we  are  interested  in  determining  parameter  'a'  accu- 
rately. Then  the  primary  model  is 


Several  input/output  measurements  at  (u^^^,  y^)  , i = 1,2,  ...,  r 
are  taken.  Then  if  the  true  model  is  used  the  parameter  estimates 
and  the  covariance  of  estimation  errors  are 
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If  the  primary  model  is  used,  the  estimate  of  'a’  is 
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The  variance  of  this  estimate  is 
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However,  this  estimate  is  biased.  The  bias  in  the  estimate  is 
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The  mean-square  error  of  this  estimate  is 


MSE  (a.  ) - ^ 
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ESTIMATOR  USING 
TRUE  MODEL 


The  most  interesting  result  is  obtained  for  0.195  < b<  0.58. 
For  this  range  of  b values,  using  the  measurements  at  two  inputs 
u^  » 1,2  with  the  simplified  model  gives  lower  mean- square  error 
than  the  case  where  all  three  measurements  are  used  either  with  the 
true  or  the  simplified  model.  In  other  words,  for  the  purpose  of 
estimating  a,  the  third  measurement  should  be  rejected.  This  is 
an  example  where  the  measurement  contains  information,  but  cannot 
be  used  for  estimation. 

Example  5.2 

Consider  now  a dynamic  system 

X = -ax  + u + b,  x(0j  » 0 (3,11) 

a is  the  time  constant  and  the  parameter  of  interest,  b is  a • 
bias  from  uncertain  initial  condition.  Let  there  be  noisy  measure- 
ments of  the  state 

y » X + V (3.12) 

V is  white  random  noise  with  power  spectral  density  of  0.01.  Let, 
u»l  0<t<l  (3.13) 

Then 

X » — (1  - e’^^)  (3.14) 


L 


If  the  parameter  b is  neglected  the  estimator  a,  will  be  biased. 

D 

Its  variance,  bias  and  mean-square  error  are 


3.2.2  Choice  of  Parameterization  in  General  Dynamic  Systems 


Consider  a continuous  time  system,  in  which  the  n .x  1 state 
X and  the  p x 1 output  y follow  the  equations 

X - £(x,u,t,0,cp)  ♦ w(t)  C3.17) 

y - h(x,u,t,0,9)  + v(t)  C3.18) 


where  u is  the  q x 1 vector  of  the  deterministic  inputs  and 
w(t)  and  vCt)  are  random  noises.  The  state  equation  and 
measurement  equation  are  parameterized  on  two  sets  of  parameters. 

0 is  the  m^  X 1 vector  of  parameters  whose  value  we  are  inter- 
ested in  estimating.  The  value  of  cp  is  not  known  and  is  not  of 
direct  interest.  Nevertheless,  because  of  its  effect  on  the  equa- 
tions of  motion,  9 must  be  given  due  consideration.  Examples 
of  cp  are  bias  in  instruments,  initial  condition  errors,  para- 
meters governing  coupling  to  other  approximately  independent  modes, 
etc. 

There  are  several  important  questions  which  come  up  at  this 
point.  Can  the  parameters  cp  be  neglected?  Should  some  or  all 
of  them  be  estimated  together  with  0?  Under  what  circumstances 
is  it  better  to  neglect  cp  than  to  estimate  cp?  The  technique 
for  selecting  useful  parametrization  would  answer  these  questions. 

Let  be  the  negative  log- likelihood  function  with  the 

global  minima  at  bhe  parameters  cp  are  estimated 

together  with  0,  the  resulting  values  are  0^^^  and  9^^^.  These 

values  are  asymptotically  unbiased  and  the  covariance  of  the 

estimates  0„  and  9„  is 
m m 
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The  covariance  of  9^^^  can  be  written  explicitly 
cov(9„)  . 

Suppose  that  we  estimate  only  the  required  parameters 
9 and  disregard  the  parameters  9.  Quite  clearly,  the  estimates 
of  9,  say  9^,  will  be  biased.  If  9 are  set  to  zero,  the 
estimates  9^.  will  be  a solution  of  equations 

I5  ce^.o)  - 0 (3.; 


Expanding  the  above  equation  around  the  corresponding  unbiased 
estimate  S’  of  9^.,  we  get. 


>n  p|'\  _ 9'IC9, 

9^C9r*0J  • 


C9,-9) 


+ Higher  Order  Terms 


. (a^j(g  a^J(9,<p]  „ 

rrr-  a<pa9  “p 


Assxming  now  that  the  second  derivatives  of  J are  approximately 


the  same  at  C^.cp)  as  at  (9».,cp  ),  we  get 

m m 


Bias  (9^)  ■ ^12^ 


C3.2S) 


The  covariance  of  the  estimated  9^  is 


cov  (9j.)  - M*J 


(3.26) 


The  mean- square  error  of  the  estimate  9^  is 


MSE  (9^)  - M’J  ^ M[\  M^2 


(3.27) 


Since  9„  is  unbiased,  its  MSE  is  equal  to  its  covariance  of 
m 


Eq.  (3.21).  It  is  clear  that 


MSE  (9^)  > .MSE  (9^) 


E(cp<P^)  i (M22  - ‘'^21‘SK2^ 


(3.28) 


where  the  second  inequality  implies  the  positive-definiteness  of 

.^1 ..  T_  _ T?  /■ T 


the  difference.  Often,  the  difference  between  E(cp9  ) and 
(M22  ■ ^'^21^11^12^'^  indefinite.  Tests  must  be  made  on  indi- 
vidual elements  of  <p  to  determine  which  of  them  should  be  in- 
cluded in  the  identified  model  and  which  ones  disregarded. 

(M22  • M2l‘'^ii^l’^  covariance  of  the  estimation  errors 

in  <p,  if  it  is  estimated. 


This  procedure  for  including  parameters  in  cp  for  identifi- 
cation requires  a priori  information  about  the  possible  variations 
in  the  parameters  9.  This  value  of  9 is  not  intended  to  supply 
any  information  about  the  parameters,  but  is  meant  solely  to 


ki 


direct  us  towards  selection  of  useful  parameterization  in  the 
estimation  of  9. 


What  if  the  range  of  variations  of  cp  is  not  known?  Since 
cp  should  be  smaller  than  the  standard  deviation  of  its  estimation 
error,  it  is  not  possible  to  use  the  estimated  value  of  cp  to 
check  the  inequality  of  Eq.  (3.28)  directly.  An  alternate  proced- 
ure is  proposed  here. 

Let  JC9^|0)  be  the  minimum  value  of  the  negative  log- 
likelihood  function  when  cp  is  excluded  from  identification.  If 
at  this  point  cp  is  included  in  the  minimization,  the  step  in  9 
and  cp,  for  small  cp,  would  be 


9-9^ 
cp  » 0 


C3.29) 


It  is  straightforward  to  show  that  the  decrease  in  the  likelihood 
function  will  be 


X 9 J w w*  Iw  \ * X 9 J 

I 89  ^^22  ■ ^21^11^12^  8^ 


(3.30) 


AJ  may  be  computed  numerically.  It  can  be  shown  that  (-2  AJ)  has 
the  distribution  % Cni2)  under  the  hypothesis  that  9 is  zero 
[1,2].  If  the  reduction  in  J is  significant,  9 or  the  elements 
9 under  consideration  must  be  included  in  the  identification  model. 
For  a 95%  confidence  level,  one  or  more  elements  of  9 must  be 
included  in  the  identified  parameters,  if. 


C-2AJ)  > C : P(x  Cm2)  < C)  » 0.95 


(3. 31) 


Note  that  C does  not  depend  upon  the  order  of  9 or  the  number 
of  data  points,  but  only  on  the  confidence  level  and  m2- 

in  Eq.  (3.30)  consists  of  two  parts:  the  stochastic 

part,  because  of  random  noise  in  the  system,  and  the  deterministic 
part  because  of  errors  in  0 and  cp.  Using  the  fact, 


m) 


(w) 

random 


random 


(3.32) 


it  is  easy  to  show  that 


E(AJ)  = - X " ’^zAl^lZ^ 


(3.33) 


Consider  the  case  where  parameters  in  <p  are  treated  sequentially. 
Then,  at  any  point  in  the  recursive  procedure  m2  * 1 and  9 is 
a scalar.  So, 


ECAJ)  . - 1 - ^ 


(3.34) 


If  we  choose  the  95%  confidence  interval  as  the  criterion  for  in- 
cluding the  set  of  identified  parameters 


P(X  (1)  < 3.84)  - 0.95 


(3.35) 


Therefore,  E(AJ)  will  meet  the  above  criterion  for  including 
9 if 


<P  > 


/I754 


1.68 


C^22-®^21«iPl2) 


(3 . 36) 


i 
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This  criterion,  in  the  expected  value  sense,  comes  close  to  that 
of  Eq . C3 . 28 ) . 


3.2.3  Summary 


The  last  section  presented  a technique  for  selecting  use- 
ful parameterization  in  parameter  estimation  problems.  The  prop' 
erties  of  the  likelihood  function  near  its  optimum  value  were  ap- 
plied and  a quantitative  measure  was  developed  for  inclusion  of 
parameters,  whose  value  is  of  no  direct  interest.  In  stabil- 
ity and  control  derivative  extraction,  these  parameters  usually 
include  initial  conditions,  bias  and  other  errors  in  instruments, 
This  procedure,  when  applied  to  dynamic  systems,  is  the  comple- 
ment of  the  F-test  of  linear  step-wise  regression. 


3.3  MODELING  OF  NONLI.NEAR  .^ND  TIME  VARYING  SYSTEMS  USING  SPLINE 
REPRESENT.ATION 


In  many  .sys-tems , the  physical  processes  are  so  complex, 
that  it  is  not  possible  to  parameterize  the  effects  of  various 
forces  from  a priori  analysis.  An  example  is  the  dynamics  of  a 
high  performance  fighter  airplane  in  the  high  angle-of-attack 
flight  regime.  This  section  develops  a technique  for  the  deter- 
mination of  a representation  of  the  model  of  a nonlinear  or  time 
varying  system  with  unknown  structure  and  the  identification  of 
the  parameters  in  the  representation  using  response  data.  First, 
it  is  necessary  to  select  a representation  which  is  general  enough 
to  include  all  possible  models  within  the  a priori  constraints. 
Multivariable  splines  are  used  for  this  purpose.  A particular  form, 
which  best  describes  the  response  data,  is  then  selected.  Finally, 
the  parameters  of  this  reduced  form  are  identified  by  using 
the  maximum  likelihood  or  another  method. 


iViv 


Section  3.3.1  discusses  splines.  In  Section  3.3.2  we  show 
how  an  unknown  static  function  may  be  approximated  by  splines 
when  the  values  of  the  function  contaminated  by  noise  are  given 
at  discrete  points.  Section  3.3.3  shows  how  nonlinear  systems 
may  be  represented  in  terms  of  the  splines.  In  Section  3.3.4, 
we  consider  as  a special  case  the  problem  of  approximating  the 
control  nonlinearity  in  a first  order  system.  The  general  problem 
of  multiinput/multioutput  systems  with  multivariable  nonlinear- 
ities is  disucssed  in  Section  3.3.S.  Section  3.3.6  gives  an 
example  of  the  identification  of  nonlinear  aerodynamic  deriva- 
tives of  a vehicle.  The  results  are  summarized  in  Section  3.3.7. 

3.3.1  Polynomial  Splines 

Polynomial  splines  are  piecewise  polynomials  in  one  or  more 
variables.  Splines  in  single  variables  have  been  studied  exten- 
sively [25]  and  have  been  applied  to  data  smoothing  and  approxi- 
mation of  known  functions  [26,27].  Study  of  multivariable  splines 
is  quite  recent  and  somewhat  incomplete  [28].  In  this  work,  we 
use  only  polynomial  splines;  therefore,  the  word  spline  will  be 
used  to  mean  polynomial  spline.  A definition  of  polynomial 
spline  in  a single  variable  is  now  given. 

Definition  3.1 

A function,  S fxl  . is  a polynomial  spline  function  of 
m, V ^ ’ 

order  m and  degree  of  continuity  v (m  > v)  in  .xe(a,b)  if, 
and  only  if:* 


Usually  polynomial  splines  are  defined  with  v=m-l.  We  deviate 
from  this  conventional  definition  so  that  it  is  possible  to  get 
better  approximations  for  functions  f(x)  which  are  continuous, 
but  do  not  have  continuous  derivatives  of  high  order.  Sometimes 
such  splines  are  said  to  have  each  knot  of  multiplicity  m-v. 


(1)  There  exists  an  increasing  sequence  of  real  numbers 

a = < X2  < ...  < ^ 

polynomial  of  degree  m or  less  in  x between 
(x^,x^_^2^)>  i®  1,2, ...k. 

(2)  Sjjj  ec'^(a,b),  where  c'^(a,b)  is  the  class  of 

functions  continuous  through  the  vth  differentiation 
over  the  interval  Ca,b). 


Definition  3.2 


x^,X2> . . . are  called  the  knots  of  the  spline  function, 


Example  3.5 

The  following  function  is  considered 


S3^lCx) 


x^  - 4x^  > 2 


5x^  + 2 

-x^  + 8x^  - 5x  + 3 


- 2 < X < 0 
0 < X < 1 


1 < X < 2 


(3.37) 


The  first  and  second  derivatives  of  this  spline  function  are 


3x^  - 8x 


-2  < X < 0 


S3,l(x) 


\ lOx 
( -3x^  + 


0 < X < 1 


1 < X < 2 


(3.38) 


\'a 


s'3;i(x) 


6x  - 8 


6x  + 16 


■2  < X < 0 
0 < X < 1 


1 < X < 2 


(3.39) 


Mote  that  Sj  j^(x)  is  continuous  everywhere,  whereas  Si'^^Cx) 
is  continuous  at  x=l,  but  discontinuous  at  x=0. 

The  parameters  of  a spline  in  a single  variable  are: 

(1)  the  order  of  the  polynomial,  m and  the  degree  of 
continuity,  v; 

(2)  the  number  and  positions  of  knots;  and 

(3)  the  coefficients  of  the  polynomial  terms. 

In  the  identification  of  system  model  from  noisy  data,  it  is  im- 
portant to  select  neither  too  many,  nor  too  few,  knots.  If  there 
are  many  more  knots  than  are  reasonable,  the  spline  function  is 
unnecessarily  complex  and  the  estimates  of  knot  positions  and 
polynomial  coefficients  have  large  errors.  Too  few  knots  may  not 
be  adequate  to  approximate  a nonlinear  function.  In  addition  to 
the  number  of  knots,  the  position  of  the  knots  is  also  important 
as  explained  by  Rice  [29],  "...the  key  to  the  successful  use  of 
splines  is  to  have  the  location  of  the  knots  as  variables."  To 
select  the  number  and  position  of  knots  optimally,  we  propose  us- 
ing a linear  least  square  method,  called  the  subset  regression 
technique  [30].  Linear  regression  techniques  may  be  used,  be- 
cause, in  the  spline  function  approximation,  the  nonlinearities 
are  expressed  as  a sum  of  several  functions,  each  multiplied  by  a 
different  constant.  In  other  words  the  nonlinearities  are  repre- 
sented as  linear  functions  of  unknown  parameters.  Recently,  the 
least  squares  method  has  been  studied  extensively  by  Mendel  who 
developed  efficient  computation  procedures  for  subset  regression 
in  both  the  batch  and  the  sequential  modes  [31]. 


3.3.2  Estimation  of  Static  Functional  Relationships 

Consider  two  scalar  variables  which  are  related  by  an  unknown 
function 


z = fCx) 


a < X < b 


(3.40) 


s 

"A 

m 
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It  is  assumed  that  fCx)  is  a continuous  function  in  x over 
the  closed  interval  [a,b].  Suppose  that  N pairs  of  values 
Cx(i)  , < b and  1 < i < are  given  where  y(t) 

is  a noise  measurement  of  z(i).  The  problem  is  to  estimate  a 
simple  representation  of  the  relationship  between  x and  z from 
the  data.  Since  the  form  of  the  nonlinear  function  is  not  known, 
it  is  approximated  by  a spline  S^^  ^ (x)  with  knots  at 
a = Xj^  < X2  < X-  < ...  ® h.  A general  representation 

for  such  a spline  is  (see  Schoenberg  [32]): 


m 

2 

j*0 


Cl-  X- 


k 

2 

2 = 2 


3X1 


(3.41) 


where 


x^  is  the  truncated  power  function 


.3  = 


jxi 

(o 


X > 0 
X < 0 


(3.42) 


The  spline  representation  of  Eq.  (3.41)  is  very  flexible. 

The  knots  can  be  taken  in  and  out  of  the  splines  by  adding  new 
terms  or  removing  certain  terms.  In  addition,  if  C„ ^ is  zero 
for  j > m'  and  2 = l,2,...,k,  the  spline  S^^^  reduces  to 

a lower  order  spline  S^  ^ Cx) . Thus,  by  selecting  terms  in  Eq. 
(3.41)  appropriately,  a lower  order  spline  or  one  with  fewer  knots 
may  be  obtained.  The  spline  is  a nonlinear  function  of  the  in- 
dependent variable  x,  but  is  linear  in  the  unknown  parameters 
C.  The  linear  least  square  regression  methods  are,  therefore, 
applicable.  When  the  least  square  method  is  used  to  select  a sub- 
set of  the  terms  from  the  equation,  it  is  called  the  subset 
regression  technique. 

The  basic  problem  in  the  successful  use  of  splines  for 
approximating  any  nonlinear  function  is  the  proper  selection  of: 
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(1)  the  order  m and  continuity  v o£  the  splines,  (b)  number 
and  position  of  knots,  and  (c)  the  various  coefficients  . 

We  indicated  above  that  the  subset  regression  technique  may  be 
applied  to  drop  terms  out  of  the  representation  of  Eq.  (3.41) 
starting  from  the  maximal  representation  (with  most  possible 
knots,  highest  possible  order  and  lowest  possible  continuity  v) . 
The  method  for  selecting  the  more  important  terms,  starting  from 
the  maximal  representation,  is  now  described. 


Since  S (x)  is  used  to  approximate  f(x),  Eq.  (3.40)  may 

ill  y V 

be  written  as 


m 

z - 2 


j=0 


■Ij 


k 

2 

2-2 


m 


3-v  + l •' 


(3.43) 


where  e is  the  modeling  error  because  of  approximating  f(x) 
by  .a  spline  function.  Eq.  (3.43)  can  be  written  for  the  given 
data  points  as; 


y(i) 


m 

T 


j-0 


C,.(x(i))^  + I 
2-2 


2 C-  .(x(i)  - x-)j. 
jav  + l ^ 


s (i) 


1 < i < N 


(3.44) 


where 
error. 
C 


e (i)  is  the  sum  of  modeling  error  and  random  measurement 
The  problem  is  to  select  the  fewest  number  of  coefficients 


'2j 

and 


which  will  provide  an  acceptable  difference  between  z(i) 
Sjjj  ^(.x(i))  in  the  sense.  Let, 
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[yCD,  y(2),  yC3),...y(N)] 


e'^  - [til),  eC2),...eCN)] 

X.  - [1,  x(i),  x^(i),  ...,  (x(i)-Xj^);] 

X “ * ^2’*"'* ^ 

Equation  (3. 44)  is  written  as 
y » X9  + e 

0 can  be  estimated  by  using  the  relation 
0 - (x'^X)'^  X^y 


(3. 45) 


however,  a subset  of  the  elements  of  the  vector  0 can  explain 
most  of  the  variation  in  y.  The  subset  regression  method  selects 
this  set  of  parameters  iteratively.  The  procedure  starts  by  in- 
cluding the  variable  most  highly  correlated  with  y.  At  every 
step  F-tests  are  used  to  determine  if  a new  variable  may  be  in- 
cluded in  the  regression  or  if  any  of  the  variables  already  in 
the  regression  may  be  dropped.  The  cut-off  F-value  is  determined 


from  the  F-distribution  based  upon  a certain  confidence  level. 

If  the  cut-off  value  is  higher  than  the  partial  F-value  of  any 
parameter  outside  the  regression,  that  parameter  may  be  included 
in  the  regression.  On  the  other  hand,  if  any  parameter  in  the 
regression  has  a partial  F-value  smaller  than  cut-off  value,  that 
parameter  may  be  dropped.  The  values  of  coefficients  of  the  terms 
which  are  in  the  equation  are  computed  simultaneously.  For  a de- 
tailed discussion  of  the  method,  see  Efroymson  [30]. 


\ .Mi 


process  y.  To  get  a useful  representation  of  the  nature  of  cp, 
it  may  be  written  as  a spline  function  in  terms  of  its  arguments. 
Let  x(t)  be  the  nxl  vector  of  the  arguments  of  the  function 
cp.  Then,  Eq.  C3.48)  may  be  written  compactly  as: 

y(t)  » cp(xCt))  + w(t)  C3. 


The  theory  of  general  multivariable  spline  functions  is  not 
known  because  of  several  conceptual  problems.  One  major  problem 
is  that  splines  over  general  multidimensional  knot  structures  are 
not  well-defined.  In  the  present  application,  however,  the  knot 
structure  may  be  chosen  arbitrarily.  It  has  been  shown  by  .Arthur 
[23 ] that  a well-behaved  polynomial  spline  may  be  defined  over  a 
multidimensional  rectangular  grid  which  will  be  called  the  net. 

To  represent  a multidimensional  function  f(x),  x R^,  in  terms  of 
splines  over  the  rectangular  nets  (also  called  tensor  product 
splines),  the  space  is  divided  into  rectangles  by  lines 


^1  " 5 11*  ^12’  ' * • ’ ^ 


im 


^2  “ ^21’  ^22’  ' ' * ’ ^2p. 


^n  ’ ^nl’  ^n2»  * ' * ’ ^np^ 

A spline  or  order  m in  n variables,  with  continuous  deriva- 
tives up  to  order  m-1  can  be  written  as 


s_  _ , (x)  ■ j.  t i •• 

jj.jj Jn"0  I Jn  1-1 


n Ji 

n X: 


Pi  P2  P3  Pn  n „ 


*^•1  *n"^' 


This  spline  has  a fixed  polynomial  representation  in  any  of  the 
multidimensional  rectangles  and  has  continuous  derivatives  up  to 
order  (m-1)  in  each  variable  at  the  boundary  of  the  rectangle. 
The  definition  can  be  extended  when  the  spline  has  continuous 
derivatives  of  order  ra-v.  For  a general  nonlinear  system  of 
Eq • C3.48),  the  above  representation  has  many  unknown  constants 
even  for  small  M and  p^.  To  obtain  useful  representation  such 
that  consistent  estimation  is  possible,  it  is  necessary  to  select 
the  more  important  terms  from  the  general  spline  representation. 


The  complexity  in  estimating  cp  depends  upon  the  number  of 
arguments  in  the  unknown  functions  and  the  domain  of  each  argu- 
ment. Therefore,  to  simplify  this  problem,  all  known  a priori 
information  about  the  system  must  be  used.  The  function  cp  must 
be  reduced  to  the  simplest  form  before  attempting  to  approximate 
unknown  nonlinearities.  For  example,  the  physics  of  the  process 
may  dictate  that  the  function  cp  is  a sum  of  nonlinear  functions 
of  the  individual  terms  yCt-i),  uCt-i),  i*l,2,...,M.  Then, 

Eq.  (3.48)  simplifies  to 


y(t)  » Z {9iCy(t-i))  + i/.  (u(t-i))  w(t> 
i-1  ^ ^ 


All  unknown  nonlinear  functions  will  be  approximated  by  splines 
(e.g.,  in  Eq.  (3.52),  each  cp^  and  will  be  written  as  indi- 

vidual spline  functions)  . 


In  nonlinear  systems,  defined  by  unknown  functions,  it  is 
assumed  that  the  behavior  of  the  system  in  distant  regions  is  sig- 
nificantly different.  Therefore,  from  a given  set  of  data,  it  is 
possible  to  estimate  a representation  of  system  behavior  only  over 
the  region  covered  by  the  data.  In  contrast,  the  behavior  of  lin- 
ear systems  is  the  same  everywhere;  hence,  a set  of  data  provides 
the  same  information  about  its  behavior  in  any  part  of  the  phase 
space.  The  rectangular  net  of  Eq . (3.50)  must  take  this  into  con- 


sideration  by  not  selecting  any  knots  in  regions  where  there  is 
no  data. 


Starting  from  the  general  spline  representation,  we  develop  I 

I 

techniques  for  choosing  important  terms  which  describe  the  nonlin-  | 

ear  behavior  of  the  system  in  different  regions.  The  next  section 
illustrates  the  method  for  a first-order  system  with  control  non- 
linearity. The  general  multivariable  systems  are  discussed  in 
Section  3.3.5, 

* 

3.3.4  Modeling  of  a Single  Input,  Single  Output  System  With  a j 

Control  Nonlinearity  ) 

To  illustrate  the  technique,  we  discuss  in  this  section,  a 
system  with  only  a single  unknown  function  in  one  variable. 

Consider  a single  input,  single  output  system  in  which  the  scalar 
output  y(t)  follows  the  equation  ’ 


M M 

yCt)  - I a^y(t-t)  2 b^g(u(t-t))  + w(t),  t>M  (3 

t ■!  t »1 

The  control  input  enters  the  system  equation  in  a nonlinear  man- 
ner. It  is  assumed  that  the  function  g does  not  depend  upon 
the  delay,  r , and  the  noise  w(t)  is  white.  The  problem  is  to 
find  a good  approximation  for  the  nonlinear  function  g(*)  given 
yCs) , u(s) , s » 1,2, . . . ,N. 

The  only  unknown  and  nonlinear  function  in  the  above  repre- 
sentation is  gC*)*  Suppose  that  u varies  between 
u for  s * 1,2,...,N.  Then,  from  the  given  data,  we  may 

estimate  gCu)  for  u„.  < u < u^_^.  The  function  g(u)  is 

^ niin  max 

written  as  the  maximal  spline  function  with  knots  such  that 

^min  ‘ h ^ ^ ^k+1  ’ %ax' 
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Techniques  of  multiple  regression  can  be  applied  to  determine 
which  coefficients  should  be  included  in  Eq.  (3.56)  to 

represent  the  control  nonlinearity. 

If  at's  are  not  known,  an  expression  similar  to  Eq.  (3.56) 
can  be  used  to  determine  the  important  terms  • y(t) 

is  the  first  regressed  on  y(t-t)  followed  by  a multiple  regres- 
sion as  described  above.  If  b^'s  are  not  known,  an  iterative 
procedure  must  be  applied,  b^ ’ s are  first  estimated  by  assuming 
a certain  form  for  the  function  g(u) , usually  based  upon  a priori 
information  (e.g.,  linear).  These  values  are  used  in  Eqs.  (3.56) 
and  (3.58)  to  determine  important  terms  in  the  spline  represent- 
ation of  the  nonlinearity.  This  representation,  in  turn,  is 
used  to  improve  estimates  of  parameters  b . This  process  is 
repeated  until  the  convergence  occurs.  Though  it  has  not  been 
proven,  this  procedure  seems  to  have  good  convergence  properties 
in  practical  applications,  as  long  as  the  initial  representation 
for  the  function  g(u)  is  reasonable. 

An  example  is  now  presented  to  demonstrate  the  technique. 

Example  3.4 

Consider  a first-order  system 

x(t+l)  = ax(t)  + bg(u(t))  + w(t) 
t » 1 , 2 , ...  100 

**  0 (3.59) 


Let  a » 0.9  and  b » 1.0.  g represents  a nonlinearity  of  the 
saturation  type 


dene 
near 
g th 


The  function  g( 
case  (b)  the  form  of 
gCu)  comes  quite  do 
a possible  ten  knots  ^ 
The  assumption  of  whi 
The  methods  for  selec 
of  spline  functions  d 
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where  t^,  are  the  knots  and  are  matrices  o£  the 

same  site  as  A . B is  expanded  in  a similar  fashion.  Then 
the  subset  regression  technique  may  be  applied  to  obtain  a useful 
spline  approximation  of  each  of  the  matrices. 


Linearized  Reoresentation  of  a Nonlinear  Svstem 


The  above  method  can  also  be  successfully  applied  to  non- 
linear systems  to  determine  a linearized  representation.  For 
example,  the  general  nonlinear  system  of  Eq.  (3.49)  may  be  written 


dy(t) 


* HJifrrr 


+ Noise 


(3.68) 


Let , 


dy(t-T  ) “ \ “u(t-T  ) * 


(3.69) 


Then , 


dy(t)  = Z { A (t)  dy(t-t)  + B (t)  du(t-t)  } 
r =1  ‘ 


+Noise 


(3.70) 


If  A^  and  B^  do  not  change  too  rapidly,  the  above  equation  can 
be  approximated  as 


Ay(t) 


* Z { A (t)  Ay(t-r  ) + B.  (t)  Au(t-r  ) } 

t =.1  ^ 


-t-Noise 


(3.71) 


Ay(t-T ) = y(t-t  +1)  - y(t-T ) 
Au(t-t  ) = u(t-T+l)  - u(t-t) 


(3.72) 


In  the  representation  of  Eq.  (3.71),  (t)  and  B (t)  can  be 

determined  as  functions  of  t as  described  previously.  The 
linearized  representations  of  systems  are  useful  in  many  applic- 
ations (e.g.,  stability  analysis,  control  design,  etc.). 

Systems  Nonlinear  in  One  Variable 

Many  systems  are  linear  except  some  elements  of  the  coef- 
ficient matrices  depend  upon  a state,  a control  or  a known 
function  of  some  states,  controls  and  time.  The  output  y is 
written  as 

m 

y(t)  = 2 {■  A fa)  y(t-r)  + B.  (a)  u(t-r)}  + w(t)  (3.73; 

t=l  ^ 

where 


a = f (x.(t)  , u(t)  , t) 


(3.74) 


In  this  case  also,  the  technique  for  time  varying  systems  may  be 
applied. 

3.3.6  Application 

The  above  procedure  has  been  implemented  using  the  multiple 
regression  technique  of  Hall  and  Gupta  [12] . It  is  applied  to 
estimate  the  nonlinear  aerodynamic  behavior  of  a vehicle  from 
flight  data. 

The  simplified  system  has  two  states  (pitch  rate,  q,  and 
angle-of-attack,  a)  and  two  inputs  (flap  deflection,  5,  and 
Mach  number,  \x)  . The  equations  of  motion  can  be  written  as 


a a ZCct,^l,5) 


4* 


q 


M(a,|a,6)  + M^q 


(3.75) 


There  are  noisy  measurements  of  a,ot,  and  q and  the  two  controls 
q is  obtained  by  fitting  a cubic  spline  to  q.  The  problem  is  to 
determined  M^(t)  and  Z^Ct)  for  design  of  the  control  system.  As 
described  in  Section  V,  Eq.  (3.75)  may  be  written  as 


Aa  = Z^(t)  Aa  + Z^(t)  Au  + Zg(t)  A6  + ^ 

Aq  a Mjj(t)  Ac  + ^^^(t)  Au  + Mg(t)  AS  + M Aq 


(3.76) 


Time  histories  of  a,p.,5,  and  q are  shown  in  Figure  3.4,  M (t)  , 


M (t)  and  Mg(t)  are  written  in  terms  of  spline  functions  with 


knots  equally  spaced  at  0.1  second  intervals.  The  multiple  regres- 
sion technique  is  used  to  estimate  the  important  terms  in  the 


spline  representation.  M^(t)  was  identified  to  be; 


Ma(t)  = -0.80  + 3.14(t-0.5)^  0. 2612Ct-l.  5)^ 


266.2(t-1.9) J -804.8(t-2.0) J ^ 1918[t-2.1)^ 


(3.77) 


This  function  is  plotted  in  Figure  3.5.  The  behavior  of  M (t) 
is  quite  close  to  what  is  expected  from  wind  tunnel  information, 


3.3.7  Summary  and  Conclusions 


This  section  discusses  the  application  of  polynomial  splines 
in  approximating  unknoiim  nonlinearities  and  time  variations  in 
dynamic  systems.  In  the  first  step  of  the  procedure,  all  unknown 
functional  relationships  are  represented  by  general  splines  of 
high  order  and  with  many  knots.  Thereafter,  the  subset  regression 
techniques  are  applied  to  select  the  more  appropriate  knots  and 
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drop  terms  of  high  order  if  they  are  not  important  in  explain- 
ing the  observed  response.  This  results  in  the  best  polynomial 
spline  which  approximates  any  nonlinearity.  Techniques  other 
than  subset  regression  may  be  applied  in  selecting  the  more 
pertinent  terms. 

Two  examples  are  presented  to  show  the  applicability  of 
the  above  technique  to  SISO  systems  with  simple  control  nonlin- 
earities and  to  MIMO  systems  with  multivariable  nonlinearities. 
In  both  cases,  excellent  results  are  obtained  even  when  the 


40 


m 


assumption  of  white  noise  was  violated.  It  is  shown  that  the 
general  polynomial  splines  of  multivariable  function  over  a 
rectangular  net  often  involves  too  many  terms.  This  stresses 
the  importance  of  including  a priori  information  about  the 
structure  of  the  system. 


The  spline  functions  are -successful  because  of  some  of 
their  excellent  properties.  Some  of  these  are; 


The  splines  can  closely  approximate  any  single  valued 
function  in  many  variables. 


The  behavior  of  a spline  function  in  one  region  has 
very  little  influence  on  its  behavior  in  another 
"distant"  region. 


The  knot  structure  of  the  fitted  spline  provides  in- 
formation on  where  greatest  variation  in  a function  is 
to  be  expected  and,  if  this  variation  can  be  esti- 
mated. 


(4)  Spline  representations  often  provide  an  insight  into 
the  physics  of  the  system. 


The  above  procedure  is  proposed  to  determine  an  approximate 
representation  for  the  unknown  nonlinear  relationships.  Final 
estimates  of  the  various  parameters  in  the  spline  representation 
will  be  done  by  using  a more,  efficient  and  unbiased  estimation 
method  like  the  maximum  likelihood  and  a computationally  better 
conditioned  form  of  the  polynomial  splines,  called  the  B-splines 
[2S]. 
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IV.  INPUT  SELECTION  TECHNIQUES  FOR  PARAMETER 
IDENTIFICATION  AND  MODEL  DISCRIMINATION 


The  accuracy  with  which  a useful  model  structure  is  isolated 
and  values  of  various  parameters  are  estimated  depends  upon  three 
factors:  (1)  the  input  used  to  conduct  the  experiment,  (2)  the 

kind  and  quality  of  the  measurements,  and  (3)  the  data  proces- 
sing algorithm.  A proper  selection  of  input  is  required  for  more 
efficient  utilization  of  the  test  time  and  for  obtaining  more  ac- 
curate results  by  processing  a certain  amount  of  data.  In  nonlin- 
ear problems,  where  the  models  are  more  complex  and  often  unknown, 
this  requirement  becomes  even  more  important.  This  has  been  dem- 
onstrated quite  successfully  by  Hall  and  Gupta  in  Ref.  12. 

The  importance  of  choosing  appropriate  control  inputs  and  ex- 
citing specific  aircraft  modes  for  extracting  stability  and  control 
derivatives  from  aircraft  flight  testing  has  long  been  recognized. 
Good  inputs  could  resolve  parameter  identif lability  problems  and 
improve  confidences  on  estimates  of  stability  and  control  deriv- 
atives obtained  from  the  resulting  flight  test  data.  In  other 
words,  with  specially  chosen  inputs- the  same  accuracy  on  para- 
meter estimates  can  be  obtained  in  much  shorter  flight  test 
time  than  with  conventional  inputs.  Shorter  flight  tests  can 
lead  to  a saving  in  time  required  for  stability  and  control 
testing  and  the  computation  requirements  for  extraction  of  aero- 
dynamic derivatives.  In  addition,  these  inputs  can  be  chosen 
specifically  to  satisfy  the  ultimate  flight  test  objectives  such 
as  control  systems  design,  simulator  parameter  specifications, 
response  prediction,  aerodynamic  model  validation,  or  handling 
qualities  evaluation.  Previous  work  on  the  selection  of  para- 
meter identifying  inputs  was  discussed  in  Section  II. 
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There  are  several  methods  of  testing  nonlinear  systems.  One 
method  is  to  take  several  operating  points  and  estimate  an  approx- 
imate linearized  model  around  each  operating  points.  This  re- 
quires inputs  which  produce  small  perturbations  around  the  oper- 
ating point,  such  that  the  approximation  of  a linear  model  are 
valid.  The  techniques  for  designing  such  inputs  are  given  in 
Section  4.1  and  practical  solution  techniques  are  described  in 
Section  4.2.  The  inputs  which  provide  best  model  discrimination 
are  described  in  Section  4.3.  The  summary  and  conclusions  are 
described  in  Section  4.4. 


4.1  INPUT  DESIGN  FOR  PARAMETER  IDENTIFICATION  OF  LINEARIZED 

MODELS  AROUND  AN  OPERATING  POINT  --  THEORETICAL  BACKGROUND 


4.1.1  Problem  Discussion 

In  linear  operation  regimes,  the  dynamic  behavior  of  aero- 
dynamic and  hydrodynamic  vehicles  can  be  described  by  state  space 
equations 


~ xCt)  = FxCt)  + GuCt)  0 < t<  T C4.1) 

dt  ■ “ “ 


X is  the  n X 1 state  vector  Ce*g*»  linear  and  angular  position 
and  rates)  u is  the  q x 1 input  vector  (e.g.,  hydrodynamic 
surface  deflections)  and  F and  G are  matrices  which  represent 
aerodynamic  and  gravitational  force  coefficients  and  certain  kine 
matic  relationships.  The  vehicles  are  instrumented  to  measure 
certain  response  variables  which  are  linear  functionals  of  the 
state  variables.  The  instruments  have  bias  and  random  errors, 
therefore,  the  p x 1 output  y is  written  as 


y(t)  » Hx(t)  + b + vCt) 


(4.2) 
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b is  the  bias  vector  and  v(t)  is  a white  Gaussian  noise  vector 
v(t)  has  zero  mean  and  power  spectral  density  R.  In  general,  ] 

may  also  depend  on  some  aerodynamic  Cor  hydrodynamic)  oarameters. 
Examples  of  F,  G and  H are  given  in  the  next  section. 


Let  a denote  the  vector  of  unknown  aerodynamic  parameters. 
It  may  be  estimated  from  measurement  of  y(t)  and  u(t).  The 
problem  is  to  select  an  admissible  input  time  history  which  pro- 
vides the  best  estimates  of  the  aerodynamic  parameters  based  upon 
some  criterion.  Because  of  stresses  and  other  considerations,  the 
outputs  are  often  constrained.  Therefore,  it  is  necessary  to  put 
an  upper  bound  on  a quadratic  function  of  the  state  and  the  input, 


1 . e . , 


jj  (x’^W^x  + u’^W^u)  dt  < E 


(4.3) 


4.1.2  Mathematical  Criteria  for  Innut  Desian 


A quantitative  measure  of  the  knowledge  about  a certain  set 
of  parameters  in  a given  response  is  given  by  the  information  ma- 
trix. If  .M  is  the  information  matrix  for  the  entire  set  of 
parameters  (aerodynamic  parameters  a and  unknown  biases  b) , it 
was  shown  by  Cramer  and  Rao  [33]  that 


Cov(0-0)  > M'"^  A D 


(4.4) 


irrespective  of  the  data  reduction  algorithm.  0 is  the  true 
value  (unknown)  and  0 the  estimated  value  of  the  parameters.  In 
input  design  procedures  discussed  in  the  next  section,  it  is  as- 
sumed that  an  efficient  identification  algorithm  is  used  such  that 
all  information  about  the  parameter  is  extracted  from  data.  The 
input  design  criterion  can  then  be  defined  in  terms  of  the  infor- 
mation matrix.  This  assumption  is  important  because  it  uncouples 
the  input  design  procedure  and  the  parameter  identification  algor- 
ithm. 


It  :.s  usually  not  possible  to  find  an  input  which  gives  bet- 
ter estimates  of  all  aerodynamic  parameters  than  any  other  input 
for  a given  operation  condition  of  the  vehicle.  Therefore,  scalar 
criteria  have  to  be  derived  from  the  information  matrix  M.  Since 
we  are  interested  in  the  estimation  of  aerodynamic  parameters  only, 
it  is  necessary  to  consider  parts  of  information  and  dispersion 
matrices  corresponding  to  o.  Call  them  ^ and  respec- 


matrices  corresponding  to  o.  Call  them  ^ 
tively.  Clearly 


» [I  ! 0] 
1 


M ri' 

.0. 


0*^“^  » [I  1 0]  m'^  ri 


where  I are  q x. q identity  matrices. 

Though  several  criteria  based  upon  the  information  and  dis- 
persion matrices  have  been  discussed  in  the  literature  [16]  , three 
of  them  are  particularly  useful  for  estimation  of  vehicle  para- 
meters ; 

Cl)  linear  functional  of  ^ 


» max 


C4.5) 


C2)  the  determinant  of  the  dispersion  matrix,  D 


(4.6) 


(3)  linear  functional  of  the  dispersion  matrix, 


J-  » min 


(4.7) 


!jB  is  such  that  for  two  positive  semi-definite  matrices  A and  B 
and  a constant  c 


Ca)  ^CA)  ^ 0 

(b)  IF  (A  + B]  » ^-CA)  + IF(B) 

(c)  iF(cA)  * ciFCA)  (4.8) 

Examples  of  linear  operator  IF  are  the  trace  and  the  weighted 
trace , 

maximizes  the  total  or  partial  sum  of  information  of  all 
aerodynamic  parameters  or  linear  combinations  of  parameters.  This 
may,  however,  lead  to  an  almost  singular  information  matrix  imply- 
ing a dispersion  matrix  with  large  diagonal  terms.  As  such  this 
criterion  is  unsuitable.  It  is  useful  because  it  arises  as  a sub- 
problem in  other  cases. 

The  positive  definite  dispersion  matrix  can  be  considered  to 
be  the  hyperellipsoid  of  uncertainty  in  the  parameter  space.  J2 
works  with  the  determinant  of  the  dispersion  matrix  and  minimizes 
the  volume  of  the  uncertainty  ellipsoid.  The  advantage  of  this 
method  is  that  it  is  independent  of  the  units  of  the  parameters 
and  implies  optimality  of  a prediction  error  criterion,  see  Mehra 
[17]. 

Jj  minimizes  a weighted  stun  of  covariances  of  parameter  es- 
timates (or  some  linear  combinations  of  parameters).  The  weight- 
ing matrix  serves  two  purposes.  Since  the  covariances  of  differ- 
ent parameters  have  different  units,  it  converts  each  term  in  the 
sum  to  the  same  units.  Secondly,  the  weighting  matrix  offers 
tremendous  flexibility  because  it  is  possible  to  assign  varying  im- 
portance to  parameters,  through  weights  on  their  nondimensional 
covariance.  This  is  considered  to  be  one  of  the  most  suitable  per- 
formance criteria  since  it  works  with  parameter  estimate  covari- 
ances directly. 


4.1.3  Optimal  Input  Desi< 


This  section  develops  two  techniques  for  input  design: 

Cl)  the  time  domain  approach,  which  gives  a time  history  of  the 
optimal  input,  and  (2)  the  frequency  domain  approach,  which  gives 
the  spectrum  of  the  optimal  input.  We  derive  equations  for  the 
information  matrix  and  then  show  algorithms  to  compute  the  optimal 
input . 


4. 1.3.1  Time  Domain  Approach 


Information  Matrix 


The  information  matrix  for  parameters  a and  b,  defining  a 
as  b element  of  the  vector  9,  is  given  by 


(4.9) 


For  simplicity,  explicit  time  dependence  is  not  shown  where  pos- 


sible. Eq.  (4.9)  can  be  written  as 


(4.10 


where 


mC“)  . /’’■  ^-1  3(Hx) 

J Q da  da 

M(ctb)  . /"T  ^-1 

Jo 


=»  tr’^ 


(4.11 


and  A^Ct)  are  derived  in  Ref.  16  and  are 


X f,  (A,^(t.OR‘^ 


supCr  ,s) 


{A.^(t,t)R  "Ct)A.(t,s) 


+ A J(t,t)R'^(t)A.  (t,s)}  dt 


It  ^ H9(t,t)G.}  dt 


The  constraint  of  Eq.  (4.3)  can  be  written  as 


n ' 

Jo  J 0 


(t)  Q(T  ,s)u(s)dT:ds  < E 


/T 

G>"^(t,  r)W,cp(t,s)Gdt 


sup  (t  , s ) 


(4.16) 


where  sup(r,r')  is  the  larger  of  t and  x'  and  6 is  the  Dirac 
delta  function.  Since  Q(t,s)  is  a positive  semidef inite  function 


'JH  V^'.  TV”  I mj 


the  unequality  sign  in  Eq.  (4.16)  can  be  converted  into  an  equality 
sign.  In  addition,  E may  be  taken  as  one.  The  optimal  inputs 
can  be  scaled  if  the  total  energy  is  different  from  one.  From 
Eqs . (4.10)  and  (4.15) 


T / T q 

/ 2 (M..(t,s) 

J 0 i,j=l 


- Al(r)R  A.  (s)  }u.  (t ) u-  (s)drds 

T J ^ J 


which  is  a quadratic  function  of  u,  like  The  methods 

which  are  described  in  this  section  are  applicable  as  long  as 

this  quadratic  relationship  e.xists.  For  the  sake  of  simplicity 

in  this  section,  however,  it  is  allowed  that  there  is  no  bias 

error  in  the  measurement.  Then  .M  and  are  the  same. 

'’a ) 

c-Trtm  nr>Tj  nn  vf  vf '•  J used  interchangeably. 


From  now  on,  M and  M''  are  used  intei 
.Maximizing  a Linear  Functional  of  M 


From  Eqs.  (4.5)  and  (4.15),  the  cost  is 


rT  rT  q 

’ Jq  -^0  i ,s))ui(r  )Oj  (s)drds 


f f u'^(t)P(r  ,s)u(s)di( 
•^0  0 


(4.17) 


P(:,s)  is  a symmetric  positive  definite  matrix.  The  cost  func- 
tion, as  well  as  the  constraint,  is  a quadratic  function  of  the  in- 
put. This  optimization  leads  to  the  linear  eigenvalue  problem 


P(t,r")u(:")dr'  = X 


Q(t,  r')u(:')d:' 


(4.18) 
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The  optimal  input  is  the  eigenvector  corresponding  to  the  maximum 
eigenvalue.  Efficient  computation  methods  to  solve  the  equation 
are  discussed  in  in  the  next  section. 

Minimization  of  the  Determinant  of  D 

The  following  result  is  proved  in  .Appendix  A. 

Statement  4.1 

A necessary  and  sufficient  condition  that  an  input  u*(t) 
mini-mizes  |D|  is  that  u*(t)  is  an  eigenvector  of  the  following 
equation  corresponding  to  its  maximum  eigenvalue 


r 

^ n 


P*(t,s)  u(s)ds 


,-l 


Q(^,s)  u(s)ds 

0 


Pij  C',s)  » TrCM*  M^j(r,s))  (4.19) 

and  refers  to  the  optimal  input 


The  maximum  eigenvalue  is 


m 


Eq, 


(4.19)  cannot  be  solved  directly  because  P*(r,s)  is  a 
function  of  the  optimal  input.  This  is  a general  nonlinear  prob- 
lem, which  bears  a close  resemblance  to  Eq.  (4.18),  if  the  lin- 
ear operator  ^ in  Eq.  (4.17)  is  selected  as 


^(•) 


Tr(M*'^(-)) 


(4.20) 


^^1 


Fortunately,  an  iterative  scheme  which  converges  to  the  optimum 
input  has  been  developed. 


■Algorithm  4 


(1)  Select  an  input  Uf,(t)  which  satisfies  the  constraint 


of  Eq. 
Mo- 


(4.16)  and  gives  a nonsingular  information  matrix, 


(2) 

Compute 

W 

(3) 

Compute 

the 

function 

of 

based  unon  M 


m 


0- 

which  maximizes  the  linear 
jn  matrix  defined  by  Eq.  (4.20). 
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(4j  Find  a p and  ^2  such  that  ’^Pi'^0*-^^  2'^m*-^^ 

gives  the  least  lD[  under  the  constraint  of  Eq.  (4.16) 
Jt  is  clear  from  the  steps  in  the  proof  of  Statement  4.1 
that  it  is  always  possible  to  find  , P2  which  re- 
duces the  cost. 


(5)  Check  if  the  input  has  converged.  If  not,  return  to 
step  (b)  . 


The  proof  of  convergence  of  this  algorithm  is  straightforward  and 
is  not  given  here. 


Minimization  of  a Linear  Functional  of  D 


Statement  4.2 


A necessary  and  sufficient  condition  that  an  input  u*(t) 
minimizes  iF(D)  is  that  u*(t)  is  an  eigenvector  of  the  follow- 
ing equation  corresponding  to  its  maximum  value 


P*(z  ,s)u(s)ds  = A 


Q(r  ,s)u(s)ds 


Pj  (t,s)  = j Cr,s)M*’^) 


(4.21) 


The  maximum  eigenvalue  is  i2’(M*'^). 


The  proof  is  similar  to  the  proof  for  Statement  4.1  and  is 
not  repeated.  This  again  is  a nonlinear  problem  and  cannot  be 
solved  directly.  Algorithm  4.1  can  be  used  with  the  linear  func- 
tional in  step  (c)  defined  by  Eq.  (4.21). 


4. 1.3. 2 Frequency  Domain  .Approach 


If  the  duration  of  the  test  is  long  compared  to  the  time 
constant  of  the  system  and  the  system  is  stable,  frequency  domain 
techniques  can  be  used.  In  frequency  domain,  the  optimal  inputs 
can  be  computed  more  efficiently  because:  (1)  the  optimal  input 


has  a discrete  spectrxim  with  finite  frequencies  [17],  and  (b) 
the  response  of  a linear  system  at  different  frequencies  is 
additive  and  independent.  Instead  of  working  with  the  time 
domain  description  of  the  system,  the  transfer  function  which 
describes  the  relation  between  the  input  and  the  output  for 
different  parameter  values  at  all  frequencies.  For  the  system 
of  Eqs . (1.1)  and  (4.2) 


y(o3)  » H(jcoI  - F)  G u(co)  + b6(co)  + v(co) 


(4.22) 


If  T(co)  ^(jcul  - F)  , the  various  blocks  of  the  information 

matrix  per  unit  time  are  easily  shown  to  be 


ij  • / Tr 


,(ab) 


1 ^ -1 
Tk  ^uu 


S ’^(0) 
Ztc  uu  ^ 


(4.23) 


where 


B.  . (co) 

13 


a T*  (CO : 


(4.24) 


Fuu(“)  is  spectral  density  of  the  input,  is  complex 


conjugate  and  u is  the  constant  part  of  the  optimal  input.  It 


is  clear  that 


,(ab) 


can  be  made  zero  by  choosing  u^  equal  to 


zero  (i.e.,  by  making  the  optimal  input  to  have  zero  power  at  zero 
frequency) . Then  the  information  matrix  becomes 


(4.25) 


Ivf 

I 


whose  inverse  is 


i 


M 


-1 


0 


C4.26) 


So  the  biases  in  measurements  have  no  effect  on  • It  is 

useful  to  effect  this  decoupling.  Then  can  be  discarded 

and  M and  used  interchangeably.  In  what  is  to  follow 

this  decoupling  will  be  assumed  and  the  optimal  input  will  have 
zero  power  at  zero  frequency. 


The  constraint  of  Eq.  (4.3),  in  frequency  domain  is 


TrCG'^(.j(oI-F'’^)'V3_(jojI-F)'^G  + W2) 

.(4.27) 

or 

^ J Tr(A(co)dF^^(cc))  » E/T  ' (4.28) 

•00 

The  average  power,  E/T  , may  be  taken  to  be  one.  The  following 
result  can  be  proved: 

Statement  4.3 

X necessary  and  sufficient  condition  that  the  input  u with 
spectrum  F*^  minimizes  |D|  (Tr(WI3))  is  that  the  maximum 

eigenvalue  A(o3)  of  the  following  equation  be  m(.2'(D*)) 

B(co)  » X(co).^(co)  (4.29) 
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where 


BCw)  * S to  min  |D 

i.j-l 


Z (WD*^)..  B..(o))  to  min  Tr(WD) 
i.j-1 


(4.30) 


The  proof  of  this  statement  is  not  given  because  it  follows  the 
proof  of  Statement  4.1  (described  in  Appendix  A).  The  following 
algorithm  can  be  used  to  compute  the  optimal  input  to  minimize 
I D 1 or  Tr(WD) . 

j\lgorithm  4.2 

(1)  Start  with  an  input  with  spectrum  such  that  the  cor 

responding  information  matrix  Mq  is  nonsingular. 

Compute  .^(cc)  and  B^^  (co)  (i,  j-  = l,  2 , . . . ,m)  . 

(2)  Based  upon  the  information  matrix,  compute  B(co)  • 

(3)  Find  co  for  which  .the  biggest  eigenvalue  of 


B ( cc)  ■ A ( co)  A ( (o) 


(4.31) 


is  maximum.  Note  that  A(co)  and  B(co)  are  Hessen- 
berg  matrices,  so  that  all  the  eigenvalues  of  the 
above  equation  are  real.  If  is  greater  than 

m (if  I D 1 is  to  be  minimized)  or  Tr(WD)  (if  TR(WD) 
is  to  be  minimized)  continue  to  the  next  step;,  other- 
wise, stop. 

(4)  Let  normalized  eigenvector  (viz. 

'^max^“^  A(co)\|/jjjax * 1)  corresponding  to  the  maxi.mum 


eigenvalue  A 


In  general,  (co)  is  complex. 


Update  the  design  as  follows 


V * ' ’■'  m * ^ -Ji  ' J.  . ■‘-ii  ■*  ' 


m. 


yj 


- “k’V“k  1'SaxC''’ 


(4.32) 


and  choose  by  a one-dimensional  search.  Return 
to  step  Cb)  . 

Most  of  the  computations  in  the  above  algorithms  are  straight- 
forward. Therefore,  this  algorithm  requires  much  less 
computation  time  than  the  corresponding  time  domain  algorithm. 

In  step  (c)  the  eigenvalue  of  a q x q matrix  equation  has  to  be 
determined  at  several  values  of  co.  This  research  can  usually 


be  restricted  to  C0»<»5  » where 

max 


00. 


max 


is  related  to  the  time 


constant  of  the  system.  In  step  (d) , a Fibonacci  search  is 
applied  because  [ pj  and  TrC<VD)  are  convex  functions  of  c 


This  algorithm  specifies  not  only  the  optimal  spectrum  of 
each  of  the  inputs,  but  also  the  various  components  of  the  cross - 
spectra.  Since  the  optimal  input  has  discrete  spectra,  it  can  be 
realized  by  a sum  of  sine  waves.  It  can  be  approximated  by  using 
binary  and  other  inputs.  If  the  optimal  input  is  implemented 
with  a sum  of  sine  waves,  the  cross-spectra  terms  give  the  phase 
relationship  between  the  sine  waves  in  various  inputs. 

1.1.4  A Suboptimal  Time  Domain  Input 

The  computation  of  the  optimal  input  in  time  domain  is  quite 
complex  because  it  requires  minimization  over  a function.  A sub- 
optimal  input,  which  approximates  the  optimal  input,  can  often 
be  selected  by  the  procedure  described  in  this  section.  The  sub- 
optimal  input  is  expressed  as  a linear  combination  of  some  basic 
input  signals. 


N 

u(t)  * Z a.5Ci,t) 
i = l ^ 


C4.53) 


are  some  preselected  function  like  steps,  sines  and  cosines, 
etc.  The  coefficients  a^  may  be  selected  to  obtain  the  best 
identifiability  of  parameters. 


'A 


% 


di 

M 

.1^ 


The  constraints  on  the  coefficients  a.  because  of  the 

1 


quadratic  constraint  on  the  input  and  the  output  may  be  written 
from  Eq.  (with  E assumed  to  equal  one). 


rT  .T  N 

J J ^ ^i? 

*'0  •'0  i»l  ^ 


Ci,t)QCr,s)  S a.^Cj,s)dt  ds  » 1 
j-1  J 


A^O’a  = 1 


(4.34) 


where 


T T 

“ f y*  §^(i,r  )Q(t  ,s)§  (j  ,s)dt  ds 


Similarly,  the  information  matrix  for  the  aerodynamic  parameters 


M » S a.a.M^^ 


(4.35) 


where  the  cross  information  matrix  defined  as 


^ r r 2 M^(t  ,S)?,  (i,T)§  Cj.s)dtds 
•'O  •'O  k,2=l  ^ ^ ^ 


The  following  algorithm  may  be  used  to  minimize  | D | , by  selecting 


the  coefficients  a^  . 


Algorithm  4.3 


The  following  procedure  converges  to  the  global  optimum: 

(1)  Compute  and  store  Q and  for  i,  j = l,2  , . . ..V  . 

(2)  Select  a vector  a.^  such  that  the  corresponding 
information  matrix  Mq  is  nonsingular. 

(3)  Compute  an  N x N matrix  P 


E 


(4)  Find  the  largest  eigenvalue  X and  the  corresp- 

max 

ending  eigenvector  of  tHe  equation 

Pa  » Xfja  (4. 37) 

If  stop;  otherwise,  continue  to  the  next 

step . 

(5)  Update  a by  selecting  (3^  and  optimally, 

Vl  = \ %ax  C*-38) 

such  that  the  constraint  o£  Eq.  C4.34)  is  satisfied. 
Compute  the  new  information  matrix  and  return  to  (3). 

Example  4.1 

Consider  a first  order  system  with  unknown  input  gain  (e.g., 
model  for  roll  control  of  a missile) 

X a -X  + 9u,  x(0)  » 0 (4.39) 

with  noise  measurements 

y(t)  » x(t)  + v(t)  0<tl2  (4.40) 

v(t)  is  assumed  to  be  white  noise  with  unit  power  spectral  density 
An  input  is  to  be  designed  to  get  a good  estimate  of  parameter  9 . 
The  optimal  input  is  constrained. 


f u2(t)dt  » 1 

n 


(4.41) 


(1)  The  optimal  input  for  this  system  has  been  computed 
previously.  It  is 


58 


u(t)  = 0.59  Csin  1.14t  + 1.14  cos  1.14t) 


C4.42) 


m 


The  inverse  of  the  information  matrix  (the  covariance  of 
parameter  9 ) is  2.29. 

(2")  A suboptimal  input  is  designed;  the  input  is  chosen  a 
sum  of  two  fxinctions 


u(t)  =»  a,s,Ct)  + a^s^Ct-l) 


(4.43) 


where  Sj^(t)  is  a unit  step  function  of  unit  duration 
at  time  t=0.  The  P and  ^ matrices  are 


1/2  0 


0 1/2 


0.343  0.125 


0.125  0.171 


(4.44) 


the  optimal  input  is  easily  shown  to  be 


a^  = 0.884 


a2  = 0.465 


The  inverse  of  the  information  matrix  is  2.45. 

which  is  about  6.7  percent  higher  than  the  optimal  input. 

The  optimal  input  and  the  suboptimal  input  are  shown 
in  Figure  4.1. 
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Figure  4.1  Optimal  and  Suboptimal  Inputs  for  a 

First  Order  System 


4.2  INPUT  DESIGN  FOR  P.4RAMETER  IDENTIFICATION  OF  LINEARIZED 

MODES  AROUND  AN  OPERATING  POINT  - COMPUTATIONAL  TECHNIQUES 


4.2.1  Introduction 


Computational  methods  have  been  a major  hindrance  in  the 
routine  computation  and  use  of  optimal  inputs  in  aerodynamic  flight 
testing.  Efficient  methods  must  be  developed  for  systems  with 
many  state  equations  and  a relatively  large  number  of  parameters. 
This  part  deals  with  the  computational  aspects  of  the  parameter 
identifying  inputs. 


Iterative  algorithms  for  minimizing  the  determinant  or  a lin- 
ear functional  of  the  dispersion  matrix  are  given  in  the  first  part 
Section  4.1,  The  two  major  steps  in  the  algorithms  are  the 
determination  of  an  input  which  maximizes  a linear  function  of 
the  information  matrix  and  the  computation  of  the  information  ma- 
trix for  various  inputs.  An  efficient  method  for  determining  the 
information  matrix  has  been  described  previously.  This  section 


As  before,  let  the  state  equations  and  the  measurement 
equations  be 


— x(t)  * Fx(t)  + Gu(t)  0 < t < T 
dt 


y(t)  =»  Hx(t)  + b + v(t) 


The  information  matrix  for  the  aerodynamic  parameters  is 


.-1  3(Hx) 

Jq  dd  30 


Since 


aCHx)  , 3H  3x 

30^  30^  ^ ” '30' 


and 


d 3x 


dt 


^ 3F  ^ ^ ^ 3G 


If  we  define 


X A Ix^ 

^0  A 1^’  75“ 


T 

3x 

’ 75“ 
m 


The  quadratic  constant  on  the  input  and  the  output  may  also  be 
written  in  terms  of  (with  unit  energy) 


(x^'^(t)TQ'^W^TQX^Ct)  + u'^W2u)dt  = 1 


C4.56) 


It  can  be  shown,  by  an  extension  of  the  results  of  Ref.  16,  that 
to  maximize  Tr(^.VM),  the  following  eigenvalue  problem  must  be 
solved 


dt  i 


r -Ir  T 
-uGcWz 


X^(O)  » \(T)  = 0 


“opt  = 


(4.57) 


(4.58) 


It  is  assumed  here  that  is  nonsingular.  If  it  is,  we  get 

singular  control.  For  example,  when  W-  is  zero,  Eq.  (4.38)  must 
be  replaced  by  G^  \(t)  = 0.  Again,  we  have  the  required  number 
of  equations  to  solve  for  all  unknown'  qualities . The  case  of  the 
singular  control  is  not  discussed. 

The  problem  is  to  find  the  minimum  u such  that  Eq.  (4.57) 
has  a nontrivial  solution  because  1/y  is  the  value  of  the  cor- 
responding performance  index.  The  simplectic  property  of  the 
Hamiltonian  is  used  to  find  a solution  to  the  eigenvalue  problem. 
The  eigenvalues  of  the  state  transition  matrix  occur  in  pairs 
+X  and  -X.  Let  s^  and  s_  be  the  positive  and  negative  sets 
of  eigenvalues  of  this  matrix  with  the  corresponding  partitioned 
eigenvector  matrix 


1 


s = 


X.  X 


A. 


C4.59) 


normalized  such  that 


T T 

A_X^  - X^A^  = I 


(4.60) 


It  can  be  shown  [16]  that  Eq.  (4.37)  has  a nontrivial  solution, 
if 

U = A'^A.e  X'^X/  (4.61) 


has  at  least  one  eigenvalue  equal  to  1.  A random  search,  followed, 
by  a Newton- Raphson  method,  may  be  used  to  find  the  smallest  u 
such  that  an  eigenvalue  of  this  matrix  is  unity.  This  procedure 
is  detailed  in  Ref.  16. 

4,2.3  .Numerical  Computation  of  a Suboptimal  Multistep  Input 


The  algorithm  makes  it  possible  to  compute  the  optimal  in- 
puts for  high  order  systems  with  many  unknown  paramaters.  The 
computation  time  may,  however,  be  quite  high.  From  a practical 
viewpoint  a suboptimal,  multi-step  input  may  be  sufficiently  good. 
In  this  section,  we  develop  an  algorithm  for  selecting  an  input 
from  a class  of  multi-step  inputs.  Suppose  that  an  input  with  s 
steps  each  of  duration  A is  to  be  selected.  The  optimal  input 
may  be  written  as 


u(t)  * 2 a.?.(t) 

i*l  ^ ^ 


0 < 


(4.62) 


N 

V 

,v 


where  *^(t)  is  a unit  step  in  the  interval  (i-l)A<_ 
For  sake  of  simplicity  we  assume  that  there  is  only  one 
multiinput  case  will  be  treated  subsequently.  Let  x^ 
sponse  of  the  reduced  sensitivity  equations  to  unput  | 

6 5 


t 1. 
input.  The 
be  the  re- 
( ^ ) > ( i • ® • > 


5 

I 

'j 


Then  because  of  the  linearity  of  the  system  and  the  f 
?2Ct)  ...?g(t)  are  delayed  ?^(t),  the  state  f 
of  Eq.  (4.62)  may  be  written  as 


X (t)  = I a-x  (t-(i-l)A) 
^ i=l  ^ c 


with  the  understanding  that  x(t)  = 0 for  t < 0.  F 
weighted  trace  of  the  information  matrix  can  be  expre 
of  a 


Tr(WM) 


'T  s 


I a.x  ^Ct-(i-l)  )W*  s a.x^ 
i=l  ^ ^ 


T 

a Pa 


p 


2 { a X ‘Ct-Ci-i:A)Tg  W T Z a.x  rt-(j-l)A) 
Jo  i»l  ^ ^ u 1 u j c 


> Z a Z a.C-Ct) 

i-1  ^ ^ j,l  3 3 


}dt  = 1 


1 . e . , 


(4.68) 


a Qa  = 1 


(4.69) 


where 


Qij  - ^^J(t-(i-l)A)To\ToXjt- (j-l)A) 


+ 5i'.(t)W2?jCt)}  dt 


(4.70) 


.Mote  that  both  P and  Q are  written  in  terms  of  the  single  ti.me 
history  propagation  of  the  reduced  sensitivity  function.  This 
makes  it  possible  to  compute  the  suboptimal  multi-step  input  using 
the  following  steps  (this  is  a special  application  of  Algorithm 
4.3,  Section  4.1). 

Steps  in  the  Computation  of  the  Suboptimal  Multi-Step  Input 

(1)  Using  the  state  equations  and  the  unknown  parameters 
from  the  reduced  sensitivity  functions  Eq.  (4.53), 

(i.e.,  compute  and  T) . 

(2)  Select  the  number  of  steps  s and  propagate  Eq.  (4.63) 

and  store  x . 

c 

(3)  Compute  and  store  Q using  Eq.  (4.70). 

(4)  Select  an  s x 1 vector  a^  and  compute  x^(t)  from 
Eq.  (4.64). 

(3)  Find  the  information  matrix  . 


C6)  Calculate  W based  upon  the  criterion  for  optimization 
and  Mg.  Also  compute  W*. 

(7)  Compute  P using  Eq.  C-^-67). 


(8)  Determine  the  highest  eigenvalue  X and  the  corres- 

max 


ponding  eigenvector  of  the  following 


Pa  - XQa 


(4.71 


(9)  Update  the  design 


^-1 


3l^  " 


(4.72 


such  that  + satisfies  Eq.  (4.70)  and  optimizes 
the  criterion  function.  If  the  criterion  function  im- 
proves very  little,  stop;  otherwise  return  to  step  (5). 


4.3  INPUT  DESIGN  FOR  MODEL  DISCRIMINATION 


Sometimes,  in  complex  processes,  several  competing  models  can 
be  postulated.  Tests  may  then  be  conducted  to  discriminate 
among  these  models.  In  linear  systems  this  may  involve  select- 
ing the  order  of  the  process  and  various  canonical  indices.  This 
problem  is  more  complex  in  nonlinear  systems  where  the  natures  of 
the  nonlinearities  may  have  to  be  discriminated  for,  in  addition 
to  determining  the  order  of  the  process.  The  inputs  used  to  ex- 
cite the  system  during  the  test  may  resolve  the  model  ambiguity 
with  lower  chance  of  error  and  in  a shorter  test  time. 


4.3.1  Problem  Statement 


A process  may  follow  one  of  several  models,  1,2,...N.  The 
ith  model  may  be  described  by 


^i  * + w^,  x(0)  =0  0 ^ t ^ T (4.73 


I 


J 

V 

‘J 


with  measurements, 

Xi  - h^fx^.u.e^)  * 

is  the  state  vector  for  the  ith  model  and  0.  is  a set  of 
unknown  parameters  in  the  model,  w^  and  are  random  white 

noise  sources  with  power  spectral  densitites  of  Q and  R,  re- 
spectively (known  or  unknown).  The  problem  is  to  determine  the 
most  likely  model  given  a certain  input  u and  the  corresponding 
response  y. 

For  the  sake  of  simplicity,  we  assume  that  there  are  two  com 
peting  models.  The  case  of  multiple  models  is  a straightforward 
extension . 

4.3.2  The  Criterion  Function 


Let  us  suppose  that  the  two  dynamic  models  are 


0 

'.lAi 


.V. 

r-'-: 


i 


» 


i 


Model  4.1 


» fj_(Xj_,U,0^) 


Yl  » h^(x^,u,9j_)  + v^ 


(4.74) 


Model  4.2 


x^  » f2(x2,u,02) 


= h2(x2,u,02)  V2 


C4.75) 


Note  that  the  two  state  vectors  may  be  of  different  dimensions  and 
the  number  of  unknovm  parameters  in  the  two  models  may  be  different. 
We  have  assumed  that  the  process  noise  does  not  exist,  because  the 
extension  to  the  process  noise  case  may  be  obtained  directly  by 
using  the  innovations  representation. 
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Suppose  an  input  u is  applied  to  the  system  and  an  output 
y observed.  The  likelihood  that  the  first  model  is  the  true 
model  is 


Si  =•  pC/j^-y/u.e^.M^^) 


(i.76] 


The  negative  logarithm  of  this  likelihood  function  is 

j ^ { Cy-hj_Cx^,u,9j_))'^Rj_‘^(y-h^(x^,u,0^)) 


^ log  1R^!>  dt 


C‘V.77] 


The  negative  log-likelihood  function  of  the  second  model  being 
the  true  model  is  also  computed  in  a similar  fashion.  We  begin 
with  the  assumption,  that  there  are  no  unknown  parameters  in  either 
model.  Then,  if  is  the  true  model,  the  expected  values  of 

and  L2  are 

ECL^CM^^))  > E J 


+ log  I R^ I } dt 

» - (p  + log  I Ri  1 )T 
2 ^ 

E(L2CMj_))  - E i 

^ Jo 


C4. ' 3) 


log  1 R^l  } dt 

1 CTr(R^R2'h  + log  |R2l)T 


2 Jo 


Ch2-h2)V‘^(h^-h2)dt 


«.svi 


(4.84; 
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ECL^CM^))  = - (p  + log  IR^HT  - 


where  is  the  number  o£  unknown  parameters  in  the  first  model. 
Computation  of  ECL2(Mj^))  is  more  difficult.  .Again  the  estimated 
value  of  Qy  follows  the  equation 


3h. 


30 


(4. 85] 


Note  that  there  could  be  a large  error  in  the  estimated  value  of 


9^  because  an  incorrect  model  is  used  in  the  identification. 


Let  9, 


be  the  estimated  value  of  9. 


2^  .^iUA  U C V.4  V a X LIW  ^ '*^2 

the  measurement.  Assuming  9^  is  close  to 
show  that 


if  there  is  no  noise  in 
9,,  it  is  easy  to 


ECL^CM^))  =•  i (Tr(R3_R2‘^)  + log  IR2DT  - ^ m2 
2 2 


* 7 I -h2C9n”S'^»iC®i) 


(4.86) 


22^  is  a parameter  vector  such  that  h2(023_)  is  the  projection 
of  h2(02)  on  If  we  take  the  difference  of  E(L^CM^)). 

A criterion  similar  to  that  of  Eq.  (4.80)  will  be  obtained. 

Again,  we  must  maximize  the  last  term  in  the  above  equation.  Note 


that  &2i  depends  upon  the  applied  input. 
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4.3.3  Optimal  Input  Selection 


The  above  discussion  shows  that  an  input  which  maximizes  J 
must  be  selected. 


1 

'^“77  -^2 (x^.u,  e^i:)  1 1 


(4.87] 


where  9,,  is  given  by 


where  x^  and  X2  follow  the  equations 


(4.88) 


(4.89) 


X2  » f2(x2,u,022)  X2(0)  = 0 


(4.90) 


The  constraints  of  the  equation  can  be  removed  by  selecting  an 
state  variable  vector  x,. 


X-  =* 


X3(0)  - X3(T)  » 0 


(4.91) 


Let  us  assume  that  u must  belong  to  a set  of  real  function  R. 
Then,  the  problem  is  to 


We  wish  to  s 
second  durat 
steps  be  of 
to  show  that 


A straightforward  computation  gives 


(Xj_-X2)^  dt  = 0.5428a^^  - 0.0228a^a,  + 0.04428a^^ 


If  there  is  a total  energy  constraint  on  the  input  the  opti- 
mal input  is  such  that 


1 - 1 - 
= -1.30 


4.4  CONCLUSIONS 


This  chapter  presented  various  techniques  for  designing  tests 
which  make  best  use  of  the  experiment  time  in  terms  of  obtaining 
the  most  accurate  parameter  estimates  within  the  constraints  of 
the  system.  Various  methodologies  for  conducting  tests  on  nonlin- 
ear systems  are  discussed  first.  This  includes  direct  estimation, 
of  the  nonlinear  relationships  and  of  conducting  tests  at  several 
operating  points  and  estimating  the  small  signal  (linearized)  mo- 
del at  each  point.  Input  design  techniques  for  both  cases  are 
covered.  A special  emphasis  is  given  to  the  determination  of  in- 
puts which  are  suboptimal,  but  easy  to  compute. 


In  nonlinear  systems,  we  often  need  model  discriminating  in- 
puts. Methods  for  the  selection  of  such  inputs  are  covered  in 
detail . 


V.  CONCLUSIONS 


To  extend  the  application  of  system  identification  technology 
to  complex  Navy  systems,  two  major  problems  have  been  solved. 

These  are;  (a)  the  development  of  general  techniques  to  determine 
the  structure  of  a nonlinear  model  which  best  explains  an  input/ 
output  data,  and  (b)  the  development  of  methods  for  designing 
inputs  and  maneuvers  which  adequately  excite  each  mode  to  provide 
accurate  estimates  of  all  unknown  parameters  in  the  identification 
stage . 

An  effective  model  structure  determination  technique  is  devel- 
oped as  a combination  of  two  methods 

(1)  Selection  of  parameterization:  This  method  selects 

a set  or  important  parameters  from  a general  model. 

The  parameters  may  correspond  to  coupling  between 
various  parts  of  the  model  or  the  order  of  dynamics 
in  the  model. 

'2)  Spline  representation  of  nonlinearities:  This  method 

uses  a general  spline  representation  tor  unknown  non- 
linearities  and  then  applies  optimal  subset  regression 
to  select  the  appropriate  terms  from  the  general  spline 
representation. 

The  input  design  technique  for  nonlinear  systems  is  a combin- 
ation of  the  following  methods. 

(1)  Inputs  for  small  motions  around  operating  points: 

Several  operating  points  are  selected  to  cover  the  non- 
linear operating  region.  Inputs  are  designed  for  small 
motions  around  these  operating  points. 

(2)  Computational  techniques  using  orthogonal  functions: 
Computational  techniques,  based  on  representing  the 
input  as  a sum  of  orthogonal  functions,  are  developed 
to  enable  determination  of  optimal  inputs  for  complex 
nonlinear  systems. 

(3)  Optimal  inputs  for  model  discrimination:  Inputs,  which 

enhance  the  distinguishability  among  several  competing 
models,  are  designed. 


Ln*j 


K_.\ 


.V.J 


These  methods  have  been  used  with  simulation  data  and  are 
currently  being  implemented  for  flight  tost  data  application. 

The  results  of  the  present  study  lead  to  the  following  major 
conclus ions : 


(1) 


Model  structure  determination  is  an  important  part  of 
the  system  identification  problem  in  complex  nonlinear 
systems.  Techniques  for  model  structure  determination 
from  noisy  data  have  been  developed  and  are  computation- 
ally feasible. 


(2) 


Inputs  must  be  carefully  selected  to  be  able  to  esti- 
mate the  model  structure  and  the  parameters  accurately. 
Techniques  for  designing  inputs  to  improve  parameter 
estimation  accuracy  or  model  distinguishability  have 
been  develooed. 
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